Declustering Databases on Heterogeneous Disk Systems
Abstract
Declustering is a well-known strategy for achieving maximum I/O parallelism in multi-disk systems. Many declustering methods have been proposed for symmetrical disk systems, i.e., multi-disk systems in which all disks have the same speed and capacity. This work deals with the problem of adapting such declustering methods to heterogeneous environments, which contain many types of disks and servers with a wide range of speeds and capacities. We deal first with the case of perfectly declustered queries, i.e., queries which retrieve a fixed proportion of the answer from each disk. We propose an algorithm which determines the fraction of the dataset which must be loaded on each disk. The algorithm may be tailored to find the disk loading for minimal response time for a given database size, or to compute a system profile showing the optimal loading of the disks for all possible ranges of database sizes. The methods proposed here are general and can be used in conjunction with most known symmetric declustering methods.
The support of the Defense Advanced Research Projects Agency, as well as the support of the Department of Energy under contract DE-AC03-76SF00098, is gratefully acknowledged.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Proceedings of the 21st VLDB Conference, Zurich, Switzerland, 1995.
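The loading problem described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes a simplified model in which each disk has a single transfer rate and a capacity, and in which the response time of a perfectly declustered query is proportional to the largest ratio of load to rate over all disks. The names `Disk`, `rate`, `capacity`, and `disk_loading` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    rate: float      # hypothetical transfer rate, data units per second
    capacity: float  # hypothetical capacity, data units


def disk_loading(disks, db_size):
    """Return (fractions, t) under the simplified model described above.

    fractions[i] is the share of the database placed on disks[i];
    t is the resulting value of max(load_i / rate_i).
    """
    if db_size <= 0:
        raise ValueError("db_size must be positive")
    if db_size > sum(d.capacity for d in disks):
        raise ValueError("database does not fit on the disks")

    # If every disk is read for time t, disk i holds min(capacity_i, rate_i * t).
    # Disk i fills up at time capacity_i / rate_i, so visit disks in that order
    # and find the smallest t at which the total held equals db_size.
    order = sorted(disks, key=lambda d: d.capacity / d.rate)
    active_rate = sum(d.rate for d in disks)  # combined rate of disks not yet full
    filled = 0.0                              # data held by disks already full
    t = order[-1].capacity / order[-1].rate   # fallback: every disk is full
    for d in order:
        saturation = d.capacity / d.rate
        if filled + active_rate * saturation >= db_size:
            t = (db_size - filled) / active_rate
            break
        filled += d.capacity
        active_rate -= d.rate

    loads = [min(d.capacity, d.rate * t) for d in disks]
    return [load / db_size for load in loads], t
```

For example, with a fast large disk `Disk(rate=2.0, capacity=100.0)`, a slow small disk `Disk(rate=1.0, capacity=10.0)`, and `db_size=60.0`, the small disk fills completely and the remaining 50 units go to the fast disk, giving `t = 25.0`. Under the same assumptions, repeating the computation for a range of `db_size` values gives a rough analogue of the system profile mentioned in the abstract.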
Similar resources
Iterative-improvement-based declustering heuristics for multi-disk databases
Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a ...
Declustering Large Multidimensional Data Sets for Range Queries over Heterogeneous Disks
Declustering is a technique to distribute data sets over multiple disks so that future retrievals can be well balanced over the disks and be performed in parallel. Although disk heterogeneity often exists in systems like clusters, most work on declustering has focused only on homogeneous environments. In this work, we investigate the declustering problem for a heterogeneous disk environment usi...
Declustering Objects for Visualization
In this paper we propose a new declustering method which is particularly suitable for image and cartographic databases used for visualization. Our declustering method is based on algebraic techniques using vectors. The algorithm which computes the disk assignment requires O(Kj log K) time where K is the number of parallel disks in the system. The resulting disk assignment maximizes the area tha...
Data Replication and Delay Balancing in Heterogeneous Disk Systems
Declustering and replication are well known techniques used to improve response time of queries in parallel disk environments. As data replication incurs a penalty for updates, database designers face the problem of finding which part of the database to load on each disk and how these parts should be replicated. This problem becomes more complicated in heterogeneous environments where disks hav...
Independent Study Report: Propose Partial Replication Schemes for Replicated Declustering and Compare Their Performance
In this course, under the supervision of our professor, we have focused on the techniques for declustering data into multiple disks. Our aim is to get familiar with all the related work done till today so we have started from the very beginning of declustering which is done on relational databases and Cartesian product files [4, 5]. We have explored the literature and find out the most outstand...
Publication date: 1995